An Open Combinatorial Diffraction Dataset Including Consensus Human and Machine Learning Labels with Quantified Uncertainty for Training New Machine Learning Models
Abstract
Modern machine learning and autonomous experimentation schemes in materials science rely on accurate analysis of the data ingested by these models. Unfortunately, accurate analysis of the underlying data can be difficult, even for domain experts, complicating the training of models intended to drive experiments. This is especially true when the goal is to identify the presence of weak signatures in diffraction or spectroscopic datasets. In this work, we examine a set of as-obtained diffraction data that track the phase transition from monoclinic to tetragonal in a Nb-doped VO2 film as a function of temperature and dopant concentration. We then task a set of experts and a set of machine learning algorithms with identifying which phases are present in each diffraction pattern, manually and algorithmically, respectively; in both cases, the labels vary dramatically, especially at the phase boundaries. We use the mode and the Shannon entropy as a method to capture, preserve and propagate the consensus labels and their variance. Further, we use the expert labels as a benchmark and demonstrate a weighted scoring method to test the performance of the machine learning generated labels. Finally, we propose a materials challenge centered around generating improved labeling algorithms. This real-world dataset is curated to act as a test bed for new algorithms. The raw data, annotations and code used in this study are all available online at data.gov, and the interested reader is encouraged to replicate and improve the existing models.
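The mode-and-entropy idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' released code: the function name `consensus_label`, the label strings, and the four example votes are hypothetical, and the base-2 logarithm is an assumed choice. For one diffraction pattern, the mode of the labelers' votes serves as the consensus label, and the Shannon entropy of the vote distribution quantifies how strongly the labelers disagree.

```python
from collections import Counter
from math import log2


def consensus_label(labels):
    """Return (mode, Shannon entropy in bits) for one pattern's votes.

    `labels` is a list of phase labels, one per labeler (hypothetical
    strings such as "monoclinic" or "tetragonal").
    """
    if not labels:
        raise ValueError("at least one label is required")

    counts = Counter(labels)
    total = len(labels)

    # Mode: the most frequent label. On a tie, Counter.most_common
    # returns the label that was encountered first.
    mode_label, _ = counts.most_common(1)[0]

    # Shannon entropy H = sum(p * log2(1 / p)) over the observed labels.
    # H is 0 when every labeler agrees and grows with disagreement.
    entropy = sum((c / total) * log2(total / c) for c in counts.values())
    return mode_label, entropy


# Hypothetical votes from four labelers for a single pattern.
votes = ["monoclinic", "monoclinic", "monoclinic", "tetragonal"]
mode_label, entropy = consensus_label(votes)
print(mode_label, round(entropy, 3))
```

Keeping the entropy next to the consensus label lets a downstream model treat patterns near a phase boundary, where votes split, as less certain than patterns on which all labelers agree.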
Journal
Journal title: Integrating Materials and Manufacturing Innovation
Year: 2021
ISSN: 2193-9764, 2193-9772
DOI: https://doi.org/10.1007/s40192-021-00213-8